Software Engineering Research

Software engineering researchers are building tools and defining methods and models.
However, there are problems with the nature and style of the research. The research is
typically bottom-up and done in isolation, so the pieces cannot easily be integrated,
either logically or physically. A great deal of the research is essentially the packaging of a
particular piece of technology with little indication of how the work would be integrated 
with other pieces of research. The research is not aimed at solving the real problems 
of software engineering, i.e., the development and maintenance of quality systems in 
a productive manner. The research results are not evaluated or analyzed via 
experimentation or refined and tailored to the application environment. Thus, it cannot 
be easily transferred into practice. Because of these limitations we have not been able 
to understand the components of the discipline as a coherent whole and the 
relationships between various models of the process and product. 

What is needed is a top-down, experimental, evolutionary framework in which research
can be focused, logically and physically integrated to produce quality software
productively, and evaluated and tailored to the application environment. This implies
the need for experimentation, which in turn implies the need for a laboratory that is 
associated with the artifact we are studying. This laboratory can only exist in an 
environment where software is being built, i.e., as part of a real software development 
and maintenance organization. Thus we propose that Software Engineering 
Laboratory (SEL) type activities exist in all organizations to support software 
engineering research. 

In this paper I will try to describe the SEL from a researcher's point of view. Jerry
Page and Frank McGarry will discuss the corporate and government benefits of the
SEL; I will try to focus my discussion on the benefits to the research community.

The SEL as a Research Laboratory 

The SEL is a laboratory that allows us to understand the various software processes,
products and other experiences, build descriptive models of them, understand the
problems associated with building software, develop solutions focused on the
problems, experiment with the proposed solutions and analyze and evaluate their
effects, refine and tailor these solutions for continual improvement and effectiveness
and enhance our understanding of their effects, and build relevant models of software
engineering experiences.

V. Basili
Univ. of Maryland

The SEL has been in business for over 15 years and, based upon our experiences, its
activities have evolved over time. In this section, I will describe the activities as they 
progressed over three phases. 

The first phase I will call the understanding phase because we worked on
understanding what we could about the environment and measurement. During this 
period we measured what we could, used available models to explain the 
environment and our behavior, and built descriptive baselines and models typifying 
our environment. 

In retrospect we made several mistakes. We collected too much data, i.e., because we 
did not know what was important we tended to collect all kinds of data hoping they 
would give us insights into the environment. We often blindly applied models and 
metrics without understanding the subtle assumptions and whether they were relevant 
in our environment. In a sense, we tried to evaluate things before we had built a deep 
understanding of what we were evaluating. We finally began to understand that 
measurement needed to be based upon models and goals. We established goals and 
a mechanism for generating measures based upon those goals, the first, primitive 
version of the Goal/Question/Metric Paradigm. This provided an informal approach to 
organizing our data. Based upon our goals, we began to build environment specific 
models by accumulating knowledge on individual projects and building baselines 
across multiple projects. Eventually we developed descriptive models that 
characterized the environment. These models included models of resources, defects, 
and product characteristics. 
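
As a minimal sketch of how such a baseline might be used, the following summarizes one measure across past projects and flags a new project that falls outside the usual range. The function names, the two-standard-deviation threshold, and the sample numbers are all illustrative assumptions; they are not SEL data or the SEL's actual models.

```python
from statistics import mean, stdev


def build_baseline(values):
    """Summarize one measure across past projects as (mean, sample std. deviation)."""
    return mean(values), stdev(values)


def deviates(value, baseline, k=2.0):
    """Return True if value lies more than k standard deviations from the baseline mean."""
    center, spread = baseline
    return abs(value - center) > k * spread


# Hypothetical defect densities (defects per KLOC) for five past projects.
past_defect_density = [4.0, 5.0, 6.0, 5.0, 5.0]
baseline = build_baseline(past_defect_density)

# A new project close to the mean is not flagged; one far from it is.
print(deviates(5.5, baseline), deviates(9.0, baseline))
```

A real baseline would be built per environment and per measure, and the threshold would itself be something to tailor.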

Once we had an understanding or characterization of the environment and the 
projects we were developing, we were able to begin the process of evaluation by 
comparing new projects against our baselines. This allowed us to proceed to phase 
two where the focus was on improving the process, product, and environment. 
During this phase, we continued to build up our data base of baselines and models, 
but we also evaluated and fed back information to the project. Many of these early data 
models were informal. The data was saved in a data base but the models existed 
mostly in documents. We began to experiment with various technologies to
understand their effect, i.e., how they changed the baselines or the models we had. In
order to provide a learning process across projects that would allow us to take
advantage of what we had learned and evolve, we developed the Quality
Improvement Paradigm, an evolutionary, experimental approach to software
improvement built on both project and organizational feedback loops.
The Goal/Question/Metric Paradigm continued to evolve to recognize different types of
goals and questions and take advantage of the multi-project perspective. We began
formalizing process, product, knowledge and quality models.

This need for formalization within the context of the Improvement Paradigm led to the 
concept of packaging models of our experiences so they were reusable on other 
projects. During this third phase we worked on choosing potentially reusable
experiences, recognizing what was appropriate and relevant for the SEL. We began 
studying notations and mathematical formalisms for defining experiences. 

There are several examples of current research projects in packaging experiences.

For example, we are working on a project characterization model that allows us to
recognize project patterns so that we can predict which projects look like the one we
are working on. This allows us to package data for use as cost estimation models
based upon our relevant past history [Briand, Basili, Thomas]. Having recognized that
most experiences need to be modified for use, we have been defining models of
tailorable experiences. For example, we are working on a tailorable test method
[Basili, Martschenko, Swain]. The method allows one to choose the appropriate test
techniques based upon the defect history of similar projects and the success rate of
the techniques in that environment. Another example is the development of a model or
reference architecture for different types of software factories [Basili, Caldiera and
Cantone]. We are defining process models for reusing experience. We have
developed a reuse-oriented evolution model [Basili and Rombach] and are working
on integrating experience models [Oivo and Basili]. We have developed the concept of
an Experience Factory, whose goal is to package software experiences and provide
them to projects upon demand, and have integrated the concept with an evolved QIP
and GQM.
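
To make the idea behind the tailorable test method concrete, here is a minimal sketch that ranks test techniques by combining a defect history with per-technique success rates. The scoring rule (a weighted average of detection rates), the function name, and all numbers are assumptions made for illustration; the method under development in the SEL may work quite differently.

```python
def rank_techniques(fault_history, success_rate):
    """Rank techniques by the expected fraction of faults they would detect.

    fault_history: {fault_class: count} from similar past projects.
    success_rate:  {technique: {fault_class: fraction detected}} in this environment.
    """
    total = sum(fault_history.values())
    if total == 0:
        raise ValueError("fault history is empty")
    scores = {}
    for technique, rates in success_rate.items():
        # Weight each class's detection rate by how often that class occurred.
        scores[technique] = sum(
            count / total * rates.get(fault_class, 0.0)
            for fault_class, count in fault_history.items()
        )
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical inputs, not measured SEL values.
fault_history = {"interface": 50, "logic": 30, "data": 20}
success_rate = {
    "code reading": {"interface": 0.6, "logic": 0.4, "data": 0.5},
    "functional testing": {"interface": 0.3, "logic": 0.6, "data": 0.5},
}
ranking = rank_techniques(fault_history, success_rate)
print(ranking[0][0])
```

With a different defect history the ranking can flip, which is the sense in which the choice of technique is tailored to the environment.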

Figure 1. Evolution of Measurement/Studies in the SEL

Figure 1 represents some of the studies we performed and the hierarchy of the
process, one phase based upon the other. That is, there was an understanding
process (Phase 1), followed by an improving process (Phase 2), followed by a
packaging process (Phase 3). You can't improve until you understand, and you can't
package until you can assess and improve. We are still understanding and trying to
improve; these activities, along with packaging, will go on forever.

The Research Framework Concepts 

We have evolved to a framework [Basili b] that is based on three basic concepts, each 
of which is itself evolving: 

o The Quality Improvement Paradigm (QIP), an evolutionary improvement paradigm,
based upon the scientific method, tailored for software engineering,

o The Goal/Question/Metric (GQM) paradigm, an approach for establishing project,
corporate, and research goals and a mechanism for measuring against those goals,

o The Experience Factory, an organization that supports research and development
by studying projects, developing and refining models, and supplying them to projects
for further analysis and refinement.

The Quality Improvement Paradigm consists of the following steps: 

o Characterize the current project and its environment with respect to a variety of
models. 

o Set the quantifiable goals for successful project performance and improvement. 

o Choose the appropriate process model and supporting methods and tools for this 
project. 


o Execute the processes, construct the products, collect and validate the prescribed 
data, and analyze it to provide real-time feedback for corrective action. 

o Analyze the data to evaluate the current practices, determine problems, record 
findings, and make recommendations for future project improvements. 

o Package the experience in the form of updated and refined models and other forms
of structured knowledge gained from this and prior projects and save it in an
experience base so it represents our current state of knowledge and is available for
future projects.
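
The six steps above can be sketched as a loop over an experience base. This is only an illustration of the control flow: the `ExperienceBase` class, the step names, and the shared context dictionary are assumptions introduced here, not an existing SEL tool.

```python
from dataclasses import dataclass, field


@dataclass
class ExperienceBase:
    """Hypothetical store of packaged models, keyed by model name."""
    models: dict = field(default_factory=dict)


QIP_STEPS = ("characterize", "set_goals", "choose_process",
             "execute", "analyze", "package")


def run_qip_cycle(project, base, steps):
    """Run the six steps in order; each step takes and returns a shared context dict."""
    context = {"project": project, "models": dict(base.models)}
    for name in QIP_STEPS:
        context = steps[name](context)
    # Packaging feeds the experience base, so the next project starts from updated models.
    base.models.update(context.get("packaged", {}))
    return context


# Usage with placeholder steps that only record the order in which they ran.
trace = []


def make_step(name):
    def step(context):
        trace.append(name)
        if name == "package":
            context["packaged"] = {"effort_model": "updated after " + context["project"]}
        return context
    return step


base = ExperienceBase()
run_qip_cycle("project A", base, {name: make_step(name) for name in QIP_STEPS})
print(base.models)
```

The point of the sketch is the feedback loop: what one project packages becomes part of what the next project characterizes itself against.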


The research emphasis is on taking each of the steps of the QIP (characterizing, goal
setting, choosing process, executing, analyzing, and packaging) and formalizing and
integrating them. Each of these steps has evolved over the years.
We have been building models of characterization. For example, what are good 
models that allow me to recognize what kind of software project I have and what 
projects are similar? Based on data, we are using pattern recognition techniques to 
recognize where to find the most appropriate kinds of experiences related to the 
current project [Briand, Basili, Thomas]. 

Goal setting has become a process of integrating models. A goal typically takes the 
form of analyzing some form of object from some perspective. I need models of both 
the object of study and the various perspectives of interest on that object. 

We want to choose processes. A key issue here is that process is a variable: I
need to select, manipulate, and change processes based on the characterization of the
project and the environment and the goals established for this particular project.

Execution needs automated support. An automated system, SME, has been
developed to support the accessing of data in a packaged form. The analysis and 
packaging issues are the major focuses of this paper. 

The Goal/Question/Metric Paradigm is a mechanism for defining and interpreting
operational and measurable software goals. Goals may be defined for any object, for a
variety of reasons, with respect to various models of quality, from various points of
view, relative to a particular environment. A particular GQM model combines models of
an object of study, e.g., a process, product, or any other experience model, and one or
more focuses, e.g., models aimed at viewing the object of study for particular
characteristics, such as models of cost, correctness, defect removal, changes,
reliability, user friendliness, etc. This implies that there are models of these quality
perspectives developed and available for use at any time.

These models can be analyzed from a point of view, e.g., the perspective of the person
needing the information, which orients the type of focus and when the interpretation of
the information is made available, and for any purpose, e.g., characterization,
evaluation, prediction, motivation, or improvement, which specifies the type of analysis
necessary.

The result is a GQM model relative to a particular environment. Environments are 
distinguished based upon a variety of factors, e.g., problem factors, people factors, 
resource factors, process factors, etc. 
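
The dimensions of a GQM goal described above (object of study, purpose, focus, point of view, environment) can be captured in a small data structure. The sketch below is one possible representation; the class and field names, the wording of the goal statement, and the sample questions and metrics are assumptions for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    text: str
    metrics: list = field(default_factory=list)


@dataclass
class Goal:
    obj: str          # object of study, e.g., a process or product
    purpose: str      # e.g., characterization, evaluation, prediction
    focus: str        # quality perspective, e.g., cost, defect removal
    viewpoint: str    # who needs the information
    environment: str  # context in which the goal is interpreted
    questions: list = field(default_factory=list)

    def statement(self):
        """Render the goal as one sentence built from its five dimensions."""
        return (f"Analyze {self.obj} for the purpose of {self.purpose} "
                f"with respect to {self.focus} from the point of view of "
                f"{self.viewpoint} in {self.environment}")

    def metrics(self):
        """Collect the metrics behind all questions, in order, without duplicates."""
        seen = []
        for question in self.questions:
            for metric in question.metrics:
                if metric not in seen:
                    seen.append(metric)
        return seen


# Hypothetical goal; the questions and metrics are invented for the example.
goal = Goal(
    obj="the Cleanroom process",
    purpose="evaluation",
    focus="defect removal",
    viewpoint="the project manager",
    environment="the SEL",
    questions=[
        Question("How many faults are found before test?",
                 ["faults found by reading", "total faults"]),
        Question("What is the failure rate during test?",
                 ["failures during test", "total faults"]),
    ],
)
print(goal.statement())
print(goal.metrics())
```

Deriving the metric list from the questions, rather than listing metrics first, reflects the lesson that measurement should follow from goals.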

Experimental Approaches 

Given a form of the scientific method, in the guise of the QIP, and a mechanism to generate
research hypotheses, in the guise of the GQM, what kinds of experimentation can we
perform? The chart in Figure 2 offers four classes of studies that we can and have
performed. The approaches can be characterized by the number of teams replicating
each project and the number of different projects analyzed.

Figure 2. Classes of Studies and Scopes of Evaluation

The single project case study is where most people begin. There is a project and 
someone has decided to study it. The results can provide some insight into project 
development in the environment. 

A multi-project variation study involves the measurement of several projects
where factors, such as a method, can be varied across similar projects. This
allows the experimenter to study the effects of variations to the extent that the
organization allows them to vary on different projects. In fact, that's literally what we do
in the SEL. We have a large number of projects, we have standard baselines of how
things should happen, and we start to perturb them by making changes and studying
the effects of those changes.

The replicated project study involves several replications of the same project by 
different subjects. Each of the issues studied is applied to the project by several 
subjects but each subject applies only one of the technologies. It permits the 
experimenter to establish control groups. 

The blocked subject-project study allows the examination of several factors within the 
framework of one study. Each of the issues studied is applied to a set of projects by 
several subjects and each subject applies each of the technologies under study. It 
permits the experimenter to control for differences in the subject population as well as 
study the effect of the particular projects. 
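
Since the four classes are characterized by the number of teams replicating each project and the number of different projects, the classification can be written down directly. The function name and the returned labels below are chosen for illustration.

```python
def classify_study(teams_per_project, projects):
    """Name the class of study from the two counts that characterize it."""
    if teams_per_project < 1 or projects < 1:
        raise ValueError("both counts must be at least 1")
    if teams_per_project == 1:
        # One team per project: no replication, so no control groups.
        return "single project case study" if projects == 1 else "multi-project variation"
    # Several teams per project: replication allows controlled comparison.
    return "replicated project" if projects == 1 else "blocked subject-project"


# For example, fifteen teams building the same system is a replicated project study.
print(classify_study(15, 1))
```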

There are two problems with the controlled types of experiments: (1) they are rather
expensive and (2) if done for large pieces of software, for example, one-year
projects, they are hard to control, especially over several replications. Therefore,
even though these types of experiments generate stronger confidence in the results
than the non-controlled experiments, they must be performed on small projects, so
the results may not scale up. If, however, these experiments are run on a small scale
and achieve reasonable statistical results, then there is motivation to experiment with the
technologies on a larger scale in either a case study or a multi-project variation.
Combining the results of the controlled experiment and the large-scale case study or
multi-project variation, we can gain confidence in the validity of the experimental
results.

It is clear in the SEL that we are avid believers in experimentation. We do not believe 
that any technology, method, tool, process model, etc. works under all circumstances. 
Everything has limits, areas where it works well or poorly. If we are dealing with 
technologies, we know they have limits. Experimentation is important in 
understanding those limits. 

Figure 3. Example Classes of Studies

Figure 3 contains several example studies we have performed in the SEL. These
studies cut across various experimental classes. When we have found something
effective as a case study, we eventually turn it into a multi-project variation because it
is effective for the environment.


An Example Set of Studies 

As an example of an effective process with which we have performed multiple types of 
experiments, consider the Cleanroom approach to software development, as 
suggested by Harlan Mills. We first ran a replicated project study at the University of 
Maryland that showed that the approach was very effective. We then decided to run a 
case study here in the SEL, which again was successful. We have since begun two 
new projects using the approach and will eventually have enough projects for an 
analysis based upon multi-project variation. 

The key elements of the Cleanroom process [Dyer] include a mathematically based
design methodology comprising function specification for programs, state
machine specification for modules, reading by stepwise abstraction, correctness
demonstrations when needed, and top-down development. The implementation is
done without any on-line testing by the developer. There is statistically based,
independent testing, based on anticipated operational use. Testing is done from a
quality assurance orientation.

The goal of the replicated Cleanroom study was to evaluate the Cleanroom process
with respect to its effects on the process, product and developers, relative to
a non-Cleanroom process [Selby, Basili, Baker]. The experiment was run at the
University of Maryland with 15 three-person teams, 10 using Cleanroom. The project
was an electronic message system (~1500 LOC). The teams were permitted 3 to 5
test submissions and the data collected consisted of background and attitude surveys,
on-line activities of the developers, and test results.

The effect of the Cleanroom approach on the process was that the Cleanroom 
developers (1) felt they more effectively applied off-line review techniques, while 
others focused on functional testing, (2) spent less time on-line and used fewer 
computer resources, and (3) tended to make all their scheduled deliveries. 

The effect of the Cleanroom approach on the product with regard to static properties 
was that the products developed using the Cleanroom approach had less dense 
complexity, a higher percentage of assignment statements, more global data, and 
more comments. With regard to operational properties, Cleanroom products more 
completely met requirements and had a higher percentage of test cases succeed. 

Based on these results, we decided that it was worth running a case study in the SEL 
to see if the approach scaled up and how it worked with changing requirements. In 
applying the approach in the SEL, you will see an application of the QIP with regard to 
improving process. We begin with the characterization step, which asks the question,
"what relevant models exist that are available for reuse?" There were three models:
the standard SEL model, which defines how software gets developed in the SEL in a
FORTRAN environment; the IBM FSD Cleanroom model that was applied on a prior
project; and the experimental model we used for the replicated project.

The SEL goals were to characterize and evaluate the Cleanroom approach in general
and specifically with regard to changing requirements. In prior applications,
Cleanroom had been used on projects where the requirements were basically fixed at
the beginning of the study. One of the questions we were often asked after the
replicated project study was "would this technology survive in an environment with
changing requirements?" Since we had not experimented with changing
requirements, we could not answer the question with much confidence.

What had been learned from the IBM/Cleanroom model application was the basic
process model, methods and techniques, and that the process was very effective in the
given environment. From the UoM/Cleanroom model application, we learned that no
developer testing enforces better reading, the process is quite effective for small
projects, formal methods are hard to apply and require skill, and there may be
insufficient failure data to effectively measure reliability.

Based upon the existing models, our goals, and the lessons learned from prior 
applications of Cleanroom, we defined an initial SEL Cleanroom process model. We 
stole what was most effective from prior applications; for example, the training was 
consistent with the University of Maryland course and we emphasized reading by at 
least two reviewers. 

Because this was a real project, and there was concern on the part of some of the
developers about the effectiveness of reading, e.g., that you needed to test certain
algorithms, we allowed back-out options, e.g., you could request permission to unit
test certain types of algorithms. These back-out options were never used, but they did
provide a comfort level for the developers. When we didn't know how to handle some
aspect of the approach in this environment we applied the standard SEL process
model as long as it didn't conflict in principle with what we were trying to do. We
monitored and made changes to the process model in real-time. We wrote lessons
learned, and we redefined the process for the next time out.

Some of the major positive results of the application of Cleanroom in the SEL include:
the approach scales up to a 30,000 SLOC project; it can be used with changing
requirements; productivity increased by about 30%; the failure rate during test was reduced
to close to 50%; there was a reduction in rework effort (95% of the fixes, as opposed to
58%, took < 1 hour to fix); only 26% of faults were found by both readers (implying two
readers are important); there were effort distribution changes, e.g., more time in design
and 50% of code time spent reading; code appears in the library later than normal and
more like a step function; and there was less computer use by a factor of 5.

Negative lessons learned include the fact that better training was needed for the
methods and techniques. The kind of training we had at the university wasn't good
enough. For example, we provided training where the examples were stacks, etc.
This was not appropriate for the application. (One thing we have done on the second
two Cleanroom projects is reuse parts of the first project as examples in the training.)
We needed better mechanisms for transferring code to testers, and the testers need to
add requirements for output analysis of code. As expected, we did not have enough
error data (with a 30,000 line project) to seed the reliability model, so there was no
payoff in reliability modeling in the SEL.


A side effect of this project was that it generated much more interest in improving the 
requirements. This requirements problem existed independent of Cleanroom, but the 
approach exposed the problem. So there has been a genuine push in having better 
defined requirements. 

These results were for a 30,000 line project and a particular application. Is that the 
size limit for the Cleanroom process? Suppose we try a 100,000 line project ... what 
are the limits of this particular technology? When does it start to fall apart? Even if it 
doesn't work for a given size project, that’s okay ... we now understand the bounds on 
that technology. It should not be expected that a technology works under all 
circumstances, every time, and every place. We have to understand as a community
that technology has limits and that we have to select and modify processes
appropriate for the situation.

The next two experiments will emphasize the application of the formal models more
(we are using the box structure approach), a change in the application domain for one
project, and a scale-up to 100 KLOC for the other project.

This has been an example of the Quality Improvement Paradigm in terms of a 
particular process, and in terms of experimental design moving from controlled 
experiments to case studies in a real environment, and moving from case study to 
multi-project environment. 

And we continue to evolve. 

Packaging the Experience 

We have just discussed a form of packaging, the documentation of the Cleanroom 
process model. We currently have a working document that represents the model as 
we understand it today. But it will change as we learn! 

Packaging experience requires the continual accumulation of evaluated experiences 
(learning) in a form that can be effectively understood and modified (experience 
models) into a repository of integrated experience models (experience base) that can 
be accessed and modified to meet the needs of the current project (reuse). 

Systematic learning requires support for recording experience, off-line generalizing
and tailoring of experience, and formalizing experience. Off-line is a key word here.
Packaging cannot be done as part of a project development. Someone cannot
perform this analysis and build models at the same time they are building software.
There needs to be a separate organization, either physically or logically separate.

Packaging useful experience requires a variety of models and an experience base. 
The models require formal notations that are tailorable, extendible, understandable, 
flexible and accessible. An effective experience base must contain an accessible and
integrated set of analyzed, synthesized, and packaged experience models that
capture the local experiences.

The Experience Factory is a logical and/or physical organization (separate from
the project organization) that supports project developments by analyzing and
synthesizing all kinds of experience models, acting as a repository for such experience,
and supplying that experience to various projects on demand. It packages experience by
building informal, formal or schematized, and productized models and measures of
various software processes, products, and other forms of knowledge via people,
documents, and automated support.

There are a variety of software engineering experiences that we can package: 
resource baselines and models, change and defect baselines and models, product 
baselines and models, process definitions and models, method and technique models 
and evaluations, products, lessons learned, quality models, etc. In the SEL, they exist 
in the form of standards, policies, tools. The documents range from sets of lessons 
learned to a manager's handbook. 

There are many forms of packaged experience. We can use mathematical equations
defining the relationship between variables, e.g., Effort = a*Size^b. We can present
raw or analyzed data in the form of histograms or pie charts, e.g., % of each class of
fault. We can plot graphs defining ranges of "normal", e.g., graphs of size growth over
time with confidence levels. We can write specific lessons learned associated with
project types, phases, or activities, e.g., reading by stepwise abstraction is most
effective for finding interface faults, or in the form of risks or recommendations, e.g.,
a unit for unit test in Ada needs to be carefully defined. We can create
models or algorithms specifying the processes, methods, or techniques, e.g., an
SADT diagram defining Design Inspections with the reading technique a variable
dependent upon the focus and reader perspective.

For example, in the SEL we have a whole set of equations that define the relationships
between a variety of variables [Basili, Panlilio-Yap]. Management can use these
equations to understand, predict, and evaluate. In the SEL, example packaged
relationships include:

Effort = 4.37 + 1.43 devlines
Effort = 5.5 + 1.5 newlines
Docpages = 99.1 + 30.9 devlines
Numruns = -108 + 151 devlines

For projects under 50 KLOC we have:

Effort = .877 + 1.5 newlines

while for projects over 50 KLOC we have:

Effort = 66.9 + .003 numruns
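
As a minimal sketch of how management might use such packaged relationships, the functions below encode three of the linear equations that use devlines and compare an actual value against the prediction. The text does not state the units of each variable, so the inputs are simply taken to be in whatever units the equations were fitted with; reading the Docpages and Numruns lines as equalities, the function names, and the sample values are assumptions.

```python
def effort_from_devlines(devlines):
    """Effort = 4.37 + 1.43 devlines (units as in the packaged equation)."""
    return 4.37 + 1.43 * devlines


def docpages_from_devlines(devlines):
    """Docpages = 99.1 + 30.9 devlines."""
    return 99.1 + 30.9 * devlines


def numruns_from_devlines(devlines):
    """Numruns = -108 + 151 devlines."""
    return -108 + 151 * devlines


def relative_deviation(actual, predicted):
    """Signed deviation of an actual value from the predicted one, as a fraction."""
    return (actual - predicted) / predicted


# Hypothetical project: devlines = 10, with a reported effort of 22.
predicted = effort_from_devlines(10)
print(predicted, relative_deviation(22.0, predicted))
```

A large deviation does not say the project is wrong; it says the project differs from the baseline and is worth a closer look.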

We have been able to demonstrate that methodology favorably impacts software cost
and quality but cumulative complexity unfavorably impacts these factors [Basili a].

We have fault profiles that allow us to compare and analyze environments and
projects. For example, what percent of faults of a particular type, based on a particular
classification scheme, occur during a standard FORTRAN development? Are the
percentages the same for an Ada development? We have been able to show that Ada
reduces the percent of interface faults, but not by the amount one might expect based
upon the ability of Ada compilers to check for interface faults [Brophy].
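
A fault profile of this kind is just the percentage of faults falling in each class of some classification scheme. The sketch below computes and compares two profiles; the class names and fault lists are invented for illustration and are not SEL measurements.

```python
from collections import Counter


def fault_profile(fault_classes):
    """Return {class: percent of all faults} for a list of classified faults."""
    counts = Counter(fault_classes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no faults to profile")
    return {cls: 100.0 * n / total for cls, n in counts.items()}


def profile_difference(a, b):
    """Percentage-point difference a - b for every class present in either profile."""
    return {cls: a.get(cls, 0.0) - b.get(cls, 0.0) for cls in sorted(set(a) | set(b))}


# Hypothetical classified faults from two developments.
fortran_faults = ["interface"] * 4 + ["logic"] * 4 + ["data"] * 2
ada_faults = ["interface"] * 3 + ["logic"] * 5 + ["data"] * 2

difference = profile_difference(fault_profile(ada_faults), fault_profile(fortran_faults))
print(difference)
```

Comparing profiles only makes sense when both developments were classified with the same scheme.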

Conclusions 

Based upon our experiences, we need a set of experience factories or SELs, each 
focused on packaging local experiences by building and tailoring local models, 
integrating technologies, studying scale-up, building experience bases, and 
developing automated aids. 

It is still hard to answer questions like: How big should an SEL be? Should the
experience factory be domain specific? Should it focus on a homogeneous
environment?

If the SELs are focussed on homogeneous environments, we will need to integrate 
these local experience factories into a high level experience factory that abstracts from 
local experiences, looks for patterns across environments, and generates the basic 
models of the science. But how is this accomplished? 

What we can do now is take advantage of the experimental nature of software 
engineering. Processes, products, and environments can be measured and can be 
used to support practical development and research. The integration of the 
Improvement Paradigm, the Goal/Question/Metric Paradigm, and the Experience 
Factory Organization can provide a framework for both development and research. 

Based upon our experience, it helps us derive descriptive models of our experiences, 
understand our experiences and our problems, evaluate and learn from our 
experiences, and build effective prescriptive models of our experiences and our 
quality objectives. It can and should be applied today and evolve with technology. 

Taking advantage of the experimental nature of software engineering has provided a
winning situation for research and development. From a researcher's perspective the
SEL has been a smashing success. Its evolution has been slow and we have made many
mistakes, but we have learned a lot. You don't have to make the same mistakes we
did; you can learn from our experiences.

Software Engineering Research

There is a great deal of software engineering 
research going on, i.e., people are building 
technologies, methods, models, etc. 


What is the problem? 

The research is mostly bottom-up, done in isolation 

It cannot be easily logically or physically integrated 

It is not aimed at solving the big problem 

It is not evaluated or analyzed via experimentation 

It is not refined and tailored to the application 
environment 

It cannot be easily transferred into practice 

We cannot understand the relationships between
various models of the process and product

Software Engineering Research


What is needed? 


A top-down, experimental, evolutionary framework
in which research can be focused, logically and 
physically integrated to produce quality software 
productively, and evaluated and tailored to the 
application environment 

An experimental laboratory that is associated with 
the artifact we are studying 

We need SEL-type activities to support software
engineering research

What is the SEL

from a researcher's point of view?

A laboratory that allows us to 

understand the various processes, products and 
other experiences and build descriptive models 

understand the problems associated with building 
software 

develop solutions focused on the problems, 
experiment with them and analyze and evaluate 
their effects 

refine and tailor these solutions for continual 
improvement and effectiveness and enhance our 
understanding of their efforts 

build models of software engineering experiences 
How have the activities evolved? 


Evolving concepts for over 15 years 
Phase 1 

Worked on understanding what we could about the 
environment and measurement 
measured what we could 
collected too much data 
used available models 

blindly applied models and metrics 
tried to evaluate before understanding 
built descriptive baselines and models 
studied individual projects 
tried to characterize the environment 
developed the Goal/Question/Metric Paradigm 
informal approach to organizing data 
How have the activities evolved? 


Phase 2 

Worked on improving the process and product 

evaluated and fed back information to project 
mostly informal data models 
data automated but not the models 
experimented with technologies 

began to understand effects locally 
developed the Quality Improvement Paradigm 
informally applied for cross-project learning 
evolved the Goal/Question/Metric Paradigm 
recognized types of goals and questions 
began formalizing process, product, knowledge 
and quality models 
How have the activities evolved? 

Phase 3 

Working on packaging experiences for reuse 
choosing potentially reusable experiences 
recognizing what is appropriate for SEL 
studying notations for defining experiences 
a project characterization model 
defining models of tailorable experiences 
a tailorable test method 
product reuse models/architectures 
defining process models for reusing experience 
defining a reuse oriented evolution model 
working on integrating experience models 
developed the Experience Factory concept and 
integrated it with an evolved QIP and GQM 
Evolution of Measurement/Studies in the SEL 


Packaging 

SEL Ada Process 

SEL Cleanroom Process 

SME 

Managers Handbook 
Experience Factory 

Improving 

Methodology Evaluation Ada 

Cost Model Analysis OOD 

Test Technique Analysis Cleanroom 
QIP CASE 

Understanding 

Modeling environment Design Measures Test Method 

Data Collection (GQM) Cost vs. Size Complexity Reuse 

Resource Baselines 

Defect Baselines 


Overview of the Current 
Framework 


Quality Improvement Paradigm 

an evolutionary improvement paradigm, based upon 
the scientific method, tailored for software 
engineering 

Goal/Question/Metric Paradigm 

an approach for establishing project, corporate, and 
research goals and a mechanism for measuring 
against those goals 

Experience Factory 

an organization that supports research and 
development by studying projects, developing and 
refining models, and supplying them to projects for 
further analysis and refinement 
Quality Improvement Paradigm 


Characterize the current project and its 
environment with respect to a variety of models. 

Set the quantifiable goals for successful project 
performance and improvement. 

Choose the appropriate process model and 
supporting methods and tools for this project. 

Execute the processes, construct the products, 
collect and validate the prescribed data, and analyze 
it to provide real-time feedback for corrective 
action. 

Analyze the data to evaluate the current 
practices, determine problems, record findings, and 
make recommendations for future project 
improvements. 

Package the experience in the form of updated and 
refined models and other forms of structured 
knowledge gained from this and prior projects and 
save it in an experience base so it represents our 
current state of knowledge and is available for 
future projects. 
The Goal Question Metric Paradigm 


A mechanism for defining and interpreting 
operational and measurable software goals 

It combines models of 

an object of study, e.g., a process, product, or 
any other experience model and 

one or more focuses, e.g., models aimed at viewing 
the object of study for particular characteristics 

that can be analyzed from a point of view, e.g., the 
perspective of the person needing the information, 
which orients the type of focus and when the 
interpretation/information is made available 

for any purpose, e.g., characterization, evaluation, 
prediction, motivation, improvement, which 
specifies the type of analysis necessary 

to generate a GQM model 

relative to a particular environment 
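
The slots named on this slide (object of study, focus, point of view, purpose, environment) can be read as fields of a record. The following is a minimal Python sketch under that reading. The class name `GQMGoal`, its field names, and the sentence template are illustrative assumptions, not notation from the paradigm itself; the example instance paraphrases the study goal given for the replicated Cleanroom study, and the question and metrics attached to it are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class GQMGoal:
    """One GQM goal. Field names are illustrative, not official notation."""
    purpose: str            # e.g. characterization, evaluation, prediction
    object_of_study: str    # a process, product, or other experience model
    focuses: list[str]      # characteristics of the object being examined
    point_of_view: str      # who needs the information
    environment: str        # the environment the model is relative to
    # Maps each question to the metrics that answer it.
    questions: dict[str, list[str]] = field(default_factory=dict)

    def statement(self) -> str:
        """Render the goal as a single sentence."""
        return (
            f"Analyze the {self.object_of_study} for the purpose of {self.purpose} "
            f"with respect to {', '.join(self.focuses)} "
            f"from the point of view of the {self.point_of_view} "
            f"in the context of {self.environment}."
        )


goal = GQMGoal(
    purpose="evaluation",
    object_of_study="Cleanroom process",
    focuses=["effects on the process", "effects on the product",
             "effects on the developers"],
    point_of_view="researcher",  # assumption: the study slide names no viewpoint
    environment="University of Maryland",
)

# Hypothetical question and metrics, shown only to illustrate the
# goal -> question -> metric refinement.
goal.questions["Do Cleanroom teams use fewer computer resources?"] = [
    "on-line time",
    "number of test submissions",
]

print(goal.statement())
```

Keeping the purpose and point of view as explicit fields makes it harder to collect data without first saying who needs it and for what kind of analysis.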
Classes of Studies 
Scopes of Evaluation 


                                    # Projects 

                          One                  More than one 

# of Teams    One         Single Project       Multi-Project 
per Project               (Case Study)         Variation 

              More than   Replicated           Blocked 
              one         Project              Subject-Project 
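
The two-by-two classification on this slide depends only on two counts: the number of projects and the number of teams per project. A small Python sketch of that lookup follows; the function name `study_class` and its argument names are assumptions made for illustration.

```python
def study_class(num_projects: int, teams_per_project: int) -> str:
    """Return the class of study for a given scope of evaluation."""
    if num_projects < 1 or teams_per_project < 1:
        raise ValueError("both counts must be at least 1")

    many_projects = num_projects > 1
    many_teams = teams_per_project > 1

    if not many_teams:
        # One team per project: case study or multi-project variation.
        return "Multi-Project Variation" if many_projects else "Single Project (Case Study)"
    # More than one team per project: replicated or blocked design.
    return "Blocked Subject-Project" if many_projects else "Replicated Project"


# The replicated Cleanroom study had 15 teams building the same system.
print(study_class(num_projects=1, teams_per_project=15))
```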
Classes of Studies 
Examples 


Single Project (Case Study)    | Multi-Project Variation 

Independent V&V                | Effect of Methodology 
Cleanroom Process              | Resource Model Studies 

Defect Analysis Studies 
Ada/Object Oriented Design 
Code Reuse in Ada/Fortran 


Replicated Project             | Blocked Subject-Project 

Effect of Methodology          | Reading vs. Testing 
Cleanroom Process              | 
Ada/O-O Design                 | 




Cleanroom Process 




Key components: 

Mathematically-based design methodology 
Function specification for programs 
State machine specification for modules 
Reading by stepwise abstraction 
Correctness demonstrations when needed 
Top-down development 

Implementation without any on-line testing by 
developer 
Independent testing 

Statistically based on anticipated 
operational use 
Quality assurance orientation 
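
Independent testing that is statistically based on anticipated operational use can be pictured as drawing test cases at random in proportion to how often each operation is expected to occur. The Python sketch below is one simple way to do that and is not the Cleanroom certification procedure itself. The operation names and probabilities are hypothetical, loosely modeled on an electronic message system, and `sample_test_cases` is an illustrative name.

```python
import random


def sample_test_cases(operational_profile, n, seed=None):
    """Draw n operations at random, weighted by expected usage frequency."""
    rng = random.Random(seed)  # a seed makes the test sample reproducible
    operations = list(operational_profile)
    weights = [operational_profile[op] for op in operations]
    return rng.choices(operations, weights=weights, k=n)


# Hypothetical operational profile: operation -> expected share of usage.
profile = {
    "send_message": 0.6,
    "read_message": 0.3,
    "delete_message": 0.1,
}

cases = sample_test_cases(profile, n=1000, seed=42)
for op in profile:
    print(op, cases.count(op))
```

Frequently used operations are exercised more often, so failures observed during test approximate the failures users would experience in operation.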
Replicated Cleanroom Study 

Study Goal: 

Analyze the Cleanroom process to evaluate it 
with respect to the effects on the process, 
product and developers relative to 
differences from a non-Cleanroom process 

Environment: 

University of Maryland 

Electronic message system (~ 1500 LOC) 

15 three-person teams (10 used Cleanroom) 

Empirical study: 

3 to 5 test submissions 
Data collected 
Background 
Attitude survey 
On-line activity 
Testing results 
Replicated Cleanroom Study 




EFFECT ON PROCESS 

Cleanroom developers felt they more effectively 
applied off-line review techniques, while others 
focused on functional testing 

Cleanroom developers spent less time on-line and 
used fewer computer resources 

Cleanroom developers tended to make all their 
scheduled deliveries 

EFFECT ON PRODUCT 

Static properties: 

Less dense complexity 
Higher percentage of assignment statements 
More global data 
More comments 
Operational properties: 

Product more completely met requirements 
Higher percentage of test cases succeeded 
DEFINING AN SEL/CLEANROOM 
PROCESS MODEL 


Existing models: standard SEL model, 
IBM/FSD Cleanroom model, 
experimental UoM Cleanroom model 

Goals: characterize and evaluate in general, 
and with respect to changing requirements 

IBM/Cleanroom model lessons learned: 
basic process model, methods and techniques 
process very effective in given environment 

UoM/Cleanroom model lessons learned: 
no testing enforces better reading 
process quite effective for small project 
formal methods hard to apply, require skill 
may have insufficient data to measure 
reliability 
DEFINING AN SEL/CLEANROOM 
PROCESS MODEL (Cont.) 


Define SEL/Cleanroom process model: 

Use informal state machine and functions 
Training consistent with UoM course on process 
model, methods, and techniques 
Emphasize reading by two reviewers 
Allow back-out options for unit testing certain 
modules 

When no new information, use standard SEL 
activities 

Monitor and make changes to the process model in 
real time 

Write lessons learned for incorporation into next 
version 

Redefine process for the next execution of the 
process model 
SOME LESSONS LEARNED USING 
CLEANROOM in the SEL 


Can scale up to 30KLOC 

Can use with changing requirements 

Failure rate during test reduced to close to 50% 

Reduction in rework effort 

95% as opposed to 58% took < 1 hour to fix 

Only 26% of faults found by both readers 

Productivity increased by about 30% 

Effort distribution changes: 
more time in design 
50% of code time spent reading 

Code appears in library 
later than normal 
more like a step function 

Less computer use by a factor of 5 
SOME LESSONS LEARNED USING 
CLEANROOM in the SEL (Cont.) 

Better training needed for methods and techniques 


Better mechanisms needed for transferring code to 
testers 

Testers need to add requirements for output 
analysis of code 

No payoff in reliability modeling 
Side effects: 

Caused more emphasis on requirements analysis 

Define next experiments: 

Apply formal models more effectively - use box 
structure approach 

Change application domain and keep size the same 
Scale up to a 100KLOC project 




Packaging the Experience 


Packaging requires the 

continual accumulation of evaluated experiences 

(learning) 

in a form that can be effectively understood 
and modified (experience models) 
into a repository of integrated experience 
models (experience base) 
that can be accessed and modified to meet the 
needs of the current project (reuse) 

Systematic learning requires support for 
recording experience 

off-line generalizing and tailoring of experience 
formalizing experience 

Packaging useful experience requires 

a variety of models and formal notations that 
are tailorable, extendible, understandable, 
flexible and accessible 

An effective experience base must contain an 
accessible and integrated set of analyzed, 
synthesized, and packaged experience models 
that capture the local experiences 
The Experience Factory 



Logical and/or physical organization (separate from 
the project organization) that supports project 
developments by 

analyzing and synthesizing all kinds of 
experience models 

acting as a repository for such experience 

supplying that experience to various projects 
on demand 


It packages experience by building 

informal, formal or schematized, and 
productized models and measures 

of various software processes, products, and 
other forms of knowledge 

via people, documents, and automated support 
What kinds of experience can we package? 


Resource Baselines and Models 
Change and Defect Baselines and Models 
Product Baselines and Models 
Process Definitions and Models 

Method and Technique Models and Evaluations 
Products 

Lessons Learned 
Quality Models 

In the SEL, they exist in the form of standards, 
policies, tools 
Forms of Packaged Experience 



Equations defining the relationship between 
variables, 

e.g. Effort = a * Size^b 

Histograms or pie charts of raw or analyzed data 
e.g. % of each class of fault 

Graphs defining ranges of "normal" 

e.g. graphs of size growth over time with 
confidence levels 

Specific lessons learned 

associated with project types, phases, activities 
e.g. reading by stepwise abstraction is most 
effective for finding interface faults 
in the form of risks or recommendations 
e.g. definition of a unit for unit test in Ada 
needs to be carefully defined 

Models or algorithms specifying the processes, 
methods, or techniques 

e.g. an SADT diagram defining Design 

Inspections with the reading technique a 
variable dependent upon the focus and 
reader perspective 
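
The first form on this slide, an equation in which effort equals a constant a times size raised to the power b, is typically packaged by fitting a and b to local project data. The Python sketch below shows one common way to do that, an ordinary least-squares fit in log-log space. The slides do not say how the SEL derived its coefficients, so the method is an assumption, and the data points are synthetic values generated from a = 2.0 and b = 1.1 purely to show that the fit recovers them.

```python
import math


def fit_power_law(sizes, efforts):
    """Fit effort = a * size**b by least squares on log(effort) vs. log(size)."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(e) for e in efforts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope in log-log space is the exponent
    a = math.exp(mean_y - b * mean_x)   # intercept gives log(a)
    return a, b


# Synthetic data (not SEL measurements), generated from a = 2.0, b = 1.1.
sizes = [10.0, 20.0, 40.0, 80.0]
efforts = [2.0 * s ** 1.1 for s in sizes]

a, b = fit_power_law(sizes, efforts)
print(f"Effort = {a:.2f} * Size^{b:.2f}")
```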
PACKAGING EXPERIENCE: 

RESOURCE MODELS 


In the SEL, 

Example packaged relationships include: 

Effort = 4.37 + 1.43 devlines 
Effort = 5.5 + 1.5 newlines 
Docpages = 99.1 + 30.9 devlines 
Numruns = -108 + 151 devlines 

Projects under 50kloc: 

Effort = .877 + 1.5 newlines 
Projects over 50kloc: 
Effort = 66.9 + .003 numruns 

Factors that affect cost and quality are: 
+methodology (favorable impact) 

-cumulative complexity (unfavorable impact) 
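
Packaged relationships of this kind are simple enough to evaluate directly. The Python sketch below encodes three of the linear models that share devlines as their input and evaluates them for one value. The slide does not state units for effort or devlines, so the input value 10 is an arbitrary illustration and the outputs carry whatever units the SEL models use; the function names are assumptions.

```python
def effort_from_devlines(devlines: float) -> float:
    """Effort = 4.37 + 1.43 * devlines (units as used in the SEL models)."""
    return 4.37 + 1.43 * devlines


def docpages_from_devlines(devlines: float) -> float:
    """Docpages = 99.1 + 30.9 * devlines."""
    return 99.1 + 30.9 * devlines


def numruns_from_devlines(devlines: float) -> float:
    """Numruns = -108 + 151 * devlines.

    The intercept is negative, so the model only yields a meaningful
    (non-negative) run count for devlines above 108 / 151.
    """
    return -108 + 151 * devlines


devlines = 10.0  # arbitrary illustrative input, units not stated on the slide
print(round(effort_from_devlines(devlines), 2))
print(round(docpages_from_devlines(devlines), 1))
print(round(numruns_from_devlines(devlines), 1))
```

Such baselines are local: they describe the environment in which the data was collected and would need to be refitted before being applied elsewhere.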
CLASSES OF ERROR* 

[Figure: classes of error, FORTRAN vs. Ada] 

• ERROR PROFILES QUITE SIMILAR, EVEN FOR DIFFERENT LANGUAGES 
• Ada SOMEWHAT FEWER INTERFACE ERRORS 

* BASED ON ERRORS FROM 5 Ada AND 6 FORTRAN PROJECTS 




Research Laboratory Needs 


We need a set of SELs or Experience Factories 
each focused on packaging local experiences by 
building and tailoring local models 
integrating technologies 
studying scale-up 
building experience bases 
developing automated aids 
How big should an SEL be? 

Should it only be domain specific? 
and 

the integration of these local experience factories 
into a high level Experience Factory that 
abstracts from local experiences 
looks for patterns across environments 
generates the basic models of the science 
How is this accomplished? 
Conclusions 


We can take advantage of the experimental nature of 
software engineering 

Process, product, environment can be measured and 
can be used to support practical development and 
research 

Integration of the 

Improvement Paradigm 
Goal/Question/Metric Paradigm 
Experience Factory Organization 
provides a framework for both development and 
research 

Based upon our experience, it helps us 
derive descriptive models of our experiences 
understand our experiences and our problems 
evaluate and learn from our experiences 
build effective prescriptive models of our 
experiences and our quality objectives 


Should be applied today and evolve with technology 

You don’t have to make the same mistakes we did, 
you can learn from our experiences 